Design and Analysis of Parallel N-Queens on Reconfigurable Hardware with Handel-C and MPI

Authors

  • Vikas Aggarwal
  • Ian A. Troxel
Abstract

N-Queens is a classical problem that has long been popular for benchmarking processing, memory, and communication architectures on high-performance computing systems. The problem is combinatorially hard and involves placing N queens on an N×N chessboard so that no queen can attack any other. The N-Queens problem is well suited for the development and benchmarking of search algorithms and the architectures on which to map them. Traditional methods for solving this problem use backtracking: a queen is placed at a safe location in each successive column until no safe location exists, and then the positions of previously placed queens are adjusted until a full solution for the board is found. The backtracking approach has exponential computational complexity, in that both the cost of the algorithm and the number of solutions grow exponentially with the size of the problem. The problem can be partitioned for parallel processing in several ways. These include embarrassingly parallel methods that find separate and distinct solutions for a given board size, and domain-decomposition methods in which a single solution is processed in parallel by cooperating resources, each dedicated to a component of the data structure (e.g., columns), that exchange data and synchronize with one another on queen placement. Although this benchmark has been primarily used for the study of search algorithms and the analysis of conventional computer system architectures, it offers interesting challenges for implementation and benchmarking on reconfigurable computing systems featuring programmable logic devices such as FPGAs. One of the key challenges in the benchmarking and exploitation of new and emerging forms of reconfigurable computing systems is the hardware design strategy. Solutions are often created in a custom fashion using hardware description languages and other tools from integrated circuit design.
However, this design flow can be cumbersome for application developers attempting to solve problems that require high-performance computing. The structure of the underlying algorithm can be significantly different when posed for a reconfigurable system versus a conventional one, and the porting and adaptation of code from high-level language implementations (e.g., in C or Fortran) to hardware designs can be extremely challenging. In order to realize the performance gains these new systems offer, algorithms designed for general-purpose processor architectures must typically undergo a redesign with the new programming paradigm in mind. Several tools are available to help ease the burden of this migration from processor code to reconfigurable hardware logic, including tools based on extensions to traditional programming languages such as Handel-C and Streams-C. However, these tools are still relatively new and thus much room remains for assessing their strengths and weaknesses in mapping key algorithms to various target architectures. This presentation showcases the development of a parallel backtracking approach to the N-Queens problem designed using the Handel-C tool from Celoxica with emphasis on design strategy, performance analysis, tradeoffs, and lessons learned. Our solutions to the N-Queens problem exploit parallel hardware structures within and between multiple FPGAs, the latter by means of interprocessor communication with the Message Passing Interface (MPI), the dominant programming model in cluster computing. Experiments are conducted using a reconfigurable computing cluster of four nodes in our lab. Each node in the cluster is a dual-Xeon server that houses a Celoxica RC1000 reconfigurable computing board containing a Xilinx Virtex 2000E device, and the nodes are interconnected with high-speed networks including Gigabit Ethernet, InfiniBand, and SCI.
The results demonstrate performance tradeoffs of parallel search problems such as N-Queens on reconfigurable architectures in commodity-based clusters, in terms of problem size, device size, and decomposition strategy. Included are lessons learned in the design and analysis of solutions to this problem, as well as comparisons with implementations on conventional computing systems.
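
The column-by-column backtracking search and the embarrassingly parallel partitioning described in the abstract can be illustrated with a minimal software sketch. It is written in Python for readability and is not the Handel-C design from the presentation. All function names are hypothetical, and splitting the work by the row of the first-column queen is one common choice assumed here, not necessarily the authors' decomposition. The "workers" are simulated sequentially, so no MPI calls are made.

```python
def count_from(n, col, rows, diag1, diag2):
    """Count completions of a partial placement, one queen per column."""
    if col == n:
        return 1  # every column holds a queen: one full solution
    total = 0
    for row in range(n):
        d1, d2 = row + col, row - col + n - 1  # indices of the two diagonals
        if rows[row] or diag1[d1] or diag2[d2]:
            continue  # square is attacked; try the next row
        rows[row] = diag1[d1] = diag2[d2] = True
        total += count_from(n, col + 1, rows, diag1, diag2)
        rows[row] = diag1[d1] = diag2[d2] = False  # backtrack
    return total


def count_solutions(n):
    """Sequential backtracking over the whole board."""
    size = 2 * n - 1
    return count_from(n, 0, [False] * n, [False] * size, [False] * size)


def count_with_first_row(n, first_row):
    """Independent subproblem: the column-0 queen is fixed at first_row."""
    size = 2 * n - 1
    rows, diag1, diag2 = [False] * n, [False] * size, [False] * size
    rows[first_row] = diag1[first_row] = diag2[first_row + n - 1] = True
    return count_from(n, 1, rows, diag1, diag2)


def partitioned_count(n, workers):
    """Deal subproblems round-robin to simulated workers, then sum (assumed scheme)."""
    partial = [0] * workers
    for first_row in range(n):
        partial[first_row % workers] += count_with_first_row(n, first_row)
    return sum(partial)
```

For the standard 8×8 board, `count_solutions(8)` and `partitioned_count(8, 4)` both give 92. Because the subproblems share no state, each could in principle be handed to a separate node or FPGA and the partial counts combined with a single reduction.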

Related resources

How Programmable is Reconfigurable Hardware? : A Design Model for Reconfigurable Architectures

With large-capacity FPGAs, such as the Xilinx Virtex family, complex systems can now be constructed from reconfigurable hardware, and sophisticated designs are more easily implemented through C-language hardware compilers, such as Handel-C [1]. However, improved language support is not enough, as effective system design needs structured methods and high-level design support, which a language by...

A System on Chip Design Framework for Prime Number Validation Using Reconfigurable Hardware

This paper presents a System on Chip (SoC) design framework for prime number validation which targets reconfigurable hardware. The primality test is crucial for most security systems using public-key schemes. It has been recognised that strong prime number generation is important, and prime validation is an intrinsic part of the generation. Our main contributions include: (1) A design method fo...

Coprocessor design to support MPI primitives in configurable multiprocessors

The Message Passing Interface (MPI) is a widely used standard for interprocessor communications in parallel computers and PC clusters. Its functions are normally implemented in software due to their enormity and complexity, thus resulting in large communication latencies. Limited hardware support for MPI is sometimes available in expensive systems. Reconfigurable computing has recently reached ...

Task-Parallel Programming of Reconfigurable Systems

This paper presents task-parallel programming, a style of application development for reconfigurable systems. Task-parallel programming enables efficient interaction between concurrent hardware and software tasks. In particular, it supports description of communication and computation tasks running in parallel to allow effective implementation of designs where data transfer time between hardwar...

Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.

Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing by using MPI and OpenMP programming languages based on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...

Publication date: 2004